85 research outputs found

    Practical and Optimal LSH for Angular Distance

    We show the existence of a Locality-Sensitive Hashing (LSH) family for the angular distance that yields an approximate Near Neighbor Search algorithm with the asymptotically optimal running time exponent. Unlike earlier algorithms with this property (e.g., Spherical LSH [Andoni, Indyk, Nguyen, Razenshteyn 2014], [Andoni, Razenshteyn 2015]), our algorithm is also practical, improving upon the well-studied hyperplane LSH [Charikar, 2002] in practice. We also introduce a multiprobe version of this algorithm, and conduct experimental evaluation on real and synthetic data sets. We complement the above positive results with a fine-grained lower bound for the quality of any LSH family for angular distance. Our lower bound implies that the above LSH family exhibits a trade-off between evaluation time and quality that is close to optimal for a natural class of LSH functions. Comment: 22 pages; an extended abstract is to appear in the proceedings of the 29th Annual Conference on Neural Information Processing Systems (NIPS 2015).
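
For orientation, the baseline the abstract compares against, hyperplane LSH [Charikar, 2002], can be sketched in a few lines. This is only a minimal illustration of that baseline, not the new LSH family the abstract introduces; the function names (`make_hyperplanes`, `hyperplane_hash`, `bit_agreement`) and the parameter choices are assumptions made for this sketch.

```python
import random


def make_hyperplanes(dim, num_bits, seed=0):
    """Draw num_bits random Gaussian normal vectors in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hyperplane_hash(vector, hyperplanes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the
    sign of the inner product with that hyperplane's normal vector."""
    return tuple(
        1 if sum(r_i * x_i for r_i, x_i in zip(normal, vector)) >= 0 else 0
        for normal in hyperplanes
    )


def bit_agreement(hash_a, hash_b):
    """Fraction of positions where two hashes carry the same bit."""
    same = sum(1 for a, b in zip(hash_a, hash_b) if a == b)
    return same / len(hash_a)


if __name__ == "__main__":
    planes = make_hyperplanes(dim=2, num_bits=2000, seed=2)
    u = [1.0, 0.0]
    v = [0.0, 1.0]  # orthogonal to u, so the angle is pi / 2
    # For a single random hyperplane, two vectors at angle theta receive
    # the same bit with probability 1 - theta / pi, so the agreement for
    # orthogonal vectors should be close to one half.
    print(bit_agreement(hyperplane_hash(u, planes), hyperplane_hash(v, planes)))
```

Because each bit depends only on the sign of an inner product, the hash ignores vector length and reacts only to direction, which is why it suits angular distance.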

    Using Minimum Description Length for Process Mining

    In the field of process mining, the goal is to automatically extract process models from event logs. Recently, many algorithms have been proposed for this task. For comparing these models, different quality measures have been proposed. Most of these measures, however, have several disadvantages: they are model-dependent, assume that the model that generated the log is known, or need negative examples of event sequences. In this paper we propose a new measure, based on the minimum description length principle, to evaluate the quality of process models that does not have these disadvantages. To illustrate the properties of the new measure, we conduct experiments and discuss the trade-off between model complexity and compression.

    TU/e data scientists predict growth of coronavirus infections per country

    Predictions of infections and deaths (three days ahead and the maximum) caused by the coronavirus in the Netherlands (as a whole and per province) and 13 other countries. The predictions give a clear picture of how the coronavirus pandemic is developing in the Netherlands and 13 other countries, and can help governments and healthcare organizations take the necessary measures. In addition, the information can contribute to a more complete and accurate picture of the coronavirus pandemic among the public and the media. The provincial predictions contribute to forecasting hospital admissions.

    Towards EPC Semantics based on State and Context (Jan Mendling, Wil van der Aalst)

    The semantics of the OR-join have been discussed for some time, in the context of EPCs, but also in the context of other business process modeling languages like YAWL. In this paper, we show that the existing solutions do not match the intuition of the modeler. Furthermore, we present a novel approach towards the definition of EPC semantics based on state and context. The approach uses two types of annotations for arcs. As in some of the other approaches, arcs are annotated with positive and negative tokens. Moreover, each arc has a context status denoting whether a positive token may still arrive. Using a four-phase approach, tokens and statuses are propagated, yielding a new kind of semantics that overcomes some of the well-known problems related to OR-joins in EPCs.